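Privacy-preserving neural network (NN) inference solutions have recently gained significant traction, with several solutions that offer different latency-bandwidth trade-offs. Many of these rely on homomorphic encryption (HE), a method of performing computations over encrypted data. However, HE operations, even with state-of-the-art schemes, are still considerably slower than their plaintext counterparts. Pruning the parameters of an NN model is a well-known approach to improving inference latency. However, pruning methods that are useful in the plaintext context may yield nearly negligible improvement in the HE case, as recent work has also demonstrated. In this work, we propose a novel set of pruning methods that reduce latency and memory requirements, bringing the effectiveness of plaintext pruning methods to HE. Crucially, our proposal employs two key techniques, viz. permutation and expansion of the packed model weights, which respectively enable pruning significantly more ciphertexts and recovering most of the accuracy loss. We demonstrate the advantage of our method on fully connected layers whose weights are packed using a recently proposed packing technique called tile tensors, which allows deep NN inference to be executed in a non-interactive mode. We evaluate our methods on various autoencoder architectures and demonstrate that, for a small mean-square reconstruction loss of $1.5 \times 10^{-5}$ on MNIST, we reduce the memory requirement and latency of HE-enabled inference by 60%.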
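The intuition behind permuting packed weights can be sketched with a toy model. The sketch below assumes that each rectangular tile of a weight matrix maps to one ciphertext and that a ciphertext can be dropped only when every weight in its tile is zero. The function names and the tile model are hypothetical illustrations, not the tile tensor API or the authors' method.

```python
def count_prunable_tiles(matrix, tile_rows, tile_cols):
    """Count tiles whose entries are all zero (toy stand-in for droppable ciphertexts)."""
    rows, cols = len(matrix), len(matrix[0])
    count = 0
    for r0 in range(0, rows, tile_rows):
        for c0 in range(0, cols, tile_cols):
            tile = [
                matrix[r][c]
                for r in range(r0, min(r0 + tile_rows, rows))
                for c in range(c0, min(c0 + tile_cols, cols))
            ]
            if all(v == 0 for v in tile):
                count += 1
    return count


def permute_rows(matrix, order):
    """Reorder the rows of the matrix; order[i] is the source row placed at position i."""
    return [matrix[i] for i in order]


# Two pruned (all-zero) rows that sit in different 2x2 tiles.
weights = [
    [0, 0],
    [1, 2],
    [0, 0],
    [3, 4],
]
before = count_prunable_tiles(weights, 2, 2)
# Grouping the zero rows into the same tile makes that tile droppable.
after = count_prunable_tiles(permute_rows(weights, [0, 2, 1, 3]), 2, 2)
```

In this toy case no tile is droppable before the permutation and one is droppable afterwards, even though the number of zero weights is unchanged. Permuting the rows of a layer's weight matrix also permutes that layer's outputs, so a real pipeline has to account for the reordering downstream.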
Automatic Speech Recognition (ASR) systems frequently use a search-based decoding strategy aiming to find the best attainable transcript by considering multiple candidates. One prominent speech recognition decoding heuristic is beam search, which seeks the transcript with the greatest likelihood computed using the predicted distribution. While showing substantial performance gains in various tasks, beam search loses some of its effectiveness when the predicted probabilities are highly confident, i.e., the predicted distribution is massed for a single or very few classes. We show that recently proposed Self-Supervised Learning (SSL)-based ASR models tend to yield exceptionally confident predictions that may hamper beam search from truly considering a diverse set of candidates. We perform a layer analysis to reveal and visualize how predictions evolve, and propose a decoding procedure that improves the performance of fine-tuned ASR models. Our proposed approach does not require further training beyond the original fine-tuning, nor additional model parameters. In fact, we find that our proposed method requires significantly less inference computation than current approaches. We propose aggregating the top M layers, potentially leveraging useful information encoded in intermediate layers, and relaxing model confidence. We demonstrate the effectiveness of our approach by conducting an empirical study on varying amounts of labeled resources and different model sizes, showing consistent improvements in particular when applied to low-resource scenarios.
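A minimal sketch of the layer-aggregation idea follows, assuming that each of the top M layers is mapped to a class distribution for a frame and that these distributions are simply averaged. The abstract does not specify the aggregation rule, so the averaging and the function names are assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_top_layers(layer_logits, m):
    """Average the per-layer class distributions of the last m layers (one frame)."""
    if not 1 <= m <= len(layer_logits):
        raise ValueError("m must be between 1 and the number of layers")
    probs = [softmax(logits) for logits in layer_logits[-m:]]
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]


# Hypothetical per-layer logits for one frame; the last layer is very confident.
layer_logits = [
    [1.0, 0.5, 0.0],
    [2.0, 1.5, 0.0],
    [8.0, 0.0, 0.0],
]
final_only = softmax(layer_logits[-1])
aggregated = aggregate_top_layers(layer_logits, 2)
```

Here the aggregated distribution is less peaked than the final layer's, which leaves beam search more probability mass to spread over alternative candidates.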
We study the ability of foundation models to learn representations for classification that are transferable to new, unseen classes. Recent results in the literature show that representations learned by a single classifier over many classes are competitive on few-shot learning problems with representations learned by special-purpose algorithms designed for such problems. We offer an explanation for this phenomenon based on the concept of class-features variability collapse, which refers to the training dynamics of deep classification networks where the feature embeddings of samples belonging to the same class tend to concentrate around their class means. More specifically, we examine the few-shot error of the learned feature map, which is the classification error of the nearest class-center classifier using centers learned from a small number of random samples from each class. Assuming that the classes appearing in the data are selected independently from a distribution, we show that the few-shot error generalizes from the training data to unseen test data, and we provide an upper bound on the expected few-shot error for new classes (selected from the same distribution) using the average few-shot error for the source classes. Additionally, we show that the few-shot error on the training data can be upper bounded using the degree of class-features variability collapse. This suggests that foundation models can provide feature maps that are transferable to new downstream tasks even with limited data available.
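The few-shot error described above can be sketched as follows: class centers are estimated from a few support samples per class, and each query is assigned to the nearest center. The feature vectors are assumed to come from an already trained feature map, and the function names are hypothetical.

```python
def class_centers(support):
    """Mean feature vector per class, given {label: [feature vectors]}."""
    centers = {}
    for label, vecs in support.items():
        dim = len(vecs[0])
        centers[label] = [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
    return centers


def nearest_center(x, centers):
    """Label of the class center closest to x in squared Euclidean distance."""
    def sqdist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(centers, key=lambda label: sqdist(x, centers[label]))


def few_shot_error(support, queries):
    """Fraction of (features, label) queries misclassified by the nearest-center rule."""
    centers = class_centers(support)
    wrong = sum(1 for x, y in queries if nearest_center(x, centers) != y)
    return wrong / len(queries)


# Two support samples per class and three labeled queries (toy 2-D features).
support = {"a": [[0.0, 0.0], [0.0, 2.0]], "b": [[10.0, 10.0], [10.0, 12.0]]}
queries = [([0.0, 1.0], "a"), ([10.0, 11.0], "b"), ([9.0, 9.0], "a")]
error = few_shot_error(support, queries)
```

When same-class features concentrate tightly around their class means, centers estimated from a handful of samples land close to the true means, which is the intuition linking variability collapse to a low few-shot error.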
Training a generative model on a single image has drawn significant attention in recent years. Single image generative methods are designed to learn the internal patch distribution of a single natural image at multiple scales. These models can be used for drawing diverse samples that semantically resemble the training image, as well as for solving many image editing and restoration tasks that involve that particular image. Here, we introduce an extended framework that simultaneously learns the internal distributions of several images by using a single model with spatially varying image-identity conditioning. Our BlendGAN opens the door to applications that are not supported by single-image models, including morphing, melding, and structure-texture fusion between two or more arbitrary images.
Denoising diffusion models (DDMs) have led to staggering performance leaps in image generation, editing and restoration. However, existing DDMs use very large datasets for training. Here, we introduce a framework for training a DDM on a single image. Our method, which we coin SinDDM, learns the internal statistics of the training image by using a multi-scale diffusion process. To drive the reverse diffusion process, we use a fully-convolutional light-weight denoiser, which is conditioned on both the noise level and the scale. This architecture allows generating samples of arbitrary dimensions, in a coarse-to-fine manner. As we illustrate, SinDDM generates diverse high-quality samples, and is applicable in a wide array of tasks, including style transfer and harmonization. Furthermore, it can be easily guided by external supervision. Particularly, we demonstrate text-guided generation from a single image using a pre-trained CLIP model.
A master face is a face image that passes face-based identity authentication for a high percentage of the population. These faces can be used to impersonate, with a high probability of success, any user, without having access to any user information. We optimize these faces for 2D and 3D face verification models, by using an evolutionary algorithm in the latent embedding space of the StyleGAN face generator. For 2D face verification, multiple evolutionary strategies are compared, and we propose a novel approach that employs a neural network to direct the search toward promising samples, without adding fitness evaluations. The results we present demonstrate that it is possible to obtain considerable coverage of the identities in the LFW or RFW datasets with fewer than 10 master faces, for six leading deep face recognition systems. In 3D, we generate faces using the 2D StyleGAN2 generator and predict a 3D structure using a deep 3D face reconstruction network. When employing two different 3D face recognition systems, we are able to obtain a coverage of 40%-50%. Additionally, we present the generation of paired 2D RGB and 3D master faces, which simultaneously match 2D and 3D models with high impersonation rates.
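Federated learning (FL) is an emerging paradigm for training machine learning models using possibly private data available at edge devices. The distributed operation of FL gives rise to challenges that are not encountered in centralized machine learning, including the need to preserve the privacy of the local datasets and the communication load caused by the repeated exchange of updated models. These challenges are often tackled individually via techniques that induce some distortion on the updated models, e.g., local differential privacy (LDP) mechanisms and lossy compression. In this work, we propose a method coined joint privacy enhancement and quantization (JoPEQ), which jointly implements lossy compression and privacy enhancement in FL settings. In particular, JoPEQ utilizes vector quantization based on random lattices, a universal compression technique whose byproduct distortion is statistically equivalent to additive noise. This distortion is leveraged to enhance privacy by augmenting the model updates with dedicated multivariate privacy-preserving noise. We show that JoPEQ simultaneously quantizes data according to a required bit rate while holding a desired privacy level, without notably affecting the utility of the learned model. This is shown via analytical LDP guarantees, derivations of distortion and convergence bounds, and numerical studies. Finally, we empirically assert that JoPEQ demolishes common attacks known to exploit privacy leakage.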
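The claim that dithered lattice quantization yields distortion behaving like additive noise can be illustrated in the simplest setting. The sketch below assumes a one-dimensional lattice (a uniform grid with spacing `step`) and subtractive dithering with a dither value shared by encoder and decoder. It is a scalar simplification rather than JoPEQ itself: the multivariate lattice and the added privacy-preserving noise are omitted, and the function names are hypothetical.

```python
import random


def encode(x, step, dither):
    """Add the shared dither, then return the index of the nearest grid point."""
    return round((x + dither) / step)


def decode(index, step, dither):
    """Map the index back to the grid and subtract the same dither."""
    return index * step - dither


rng = random.Random(0)  # stands in for randomness shared by the user and the server
step = 0.5
x = 1.2345
dither = rng.uniform(-step / 2, step / 2)
x_hat = decode(encode(x, step, dither), step, dither)
error = x_hat - x  # bounded by step / 2 in magnitude, whatever x is
```

Only the integer index needs to be transmitted, so `step` controls the bit rate. With the dither drawn uniformly over one quantization cell, the reconstruction error is uniformly distributed and statistically independent of `x`, which is the property that lets quantization distortion be treated as noise.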
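An abundance of recent impossibility results establish that regret minimization in Markov games with adversarial opponents is both statistically and computationally intractable. Nevertheless, none of these results preclude the possibility of regret minimization under the assumption that all parties adopt the same learning procedure. In this work, we present the first (to our knowledge) algorithm for learning in general-sum Markov games that provides sublinear regret guarantees when executed by all agents. The bounds we obtain are for swap regret and thus, along the way, imply convergence to a correlated equilibrium. Our algorithm is decentralized, computationally efficient, and does not require any communication between agents. Our key observation is that online learning via policy optimization in Markov games essentially reduces to a form of weighted regret minimization, with unknown weights determined by the path length of the agents' policy sequence. Consequently, controlling the path length leads to weighted regret objectives for which sufficiently adaptive algorithms provide sublinear regret guarantees.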
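EC-KitY is a comprehensive Python library for doing evolutionary computation (EC), licensed under the GNU General Public License v3.0 and compatible with scikit-learn. Designed with modern software engineering and machine learning integration in mind, EC-KitY can support all popular EC paradigms, including genetic algorithms, genetic programming, coevolution, evolutionary multi-objective optimization, and more. This paper provides an overview of the package, including the ease of setting up an EC experiment, the architecture, the main features, and a comparison with other libraries.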
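We consider the problem of designing uniformly stable first-order optimization algorithms for empirical risk minimization. Uniform stability is often used to obtain generalization error bounds for optimization algorithms, and we are interested in a general approach to achieving it. For Euclidean geometry, we suggest a black-box conversion which, given a smooth optimization algorithm, produces a uniformly stable version of the algorithm while maintaining its convergence rate up to logarithmic factors. Using this reduction, we obtain a (nearly) optimal algorithm for smooth optimization with convergence rate $\widetilde{O}(1/T^2)$ and uniform stability $O(T^2/n)$, resolving an open problem of Chen et al. (2018); Attia and Koren (2021). For more general geometries, we develop a variant of Mirror Descent for smooth optimization with convergence rate $\widetilde{O}(1/T)$ and uniform stability $O(T/n)$, leaving open the question of devising a general conversion method as in the Euclidean case.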